Skip to content

Conversation

@provos
Copy link

@provos provos commented Nov 30, 2025

Adds MPS device support for both image and video predictors on Apple Silicon.

Changes:

  • Add get_default_device() utility that detects MPS availability
  • Fix device mismatches (coords cache, freqs_cis cache)
  • Add MPS workaround for complex tensor repeat() in RoPE
  • Make torch._assert_async conditional on CUDA
  • Fix MPS memory leak in video predictor via synchronization points

Performance of the Video predictor:

  • ~3x faster than CPU
  • Runs with ~38GB peak memory. This is due to the way that MPS caches graphs. Before adding the synchronization points, running the video predictor would consume all available memory.

this has prebuilt wheels for apple silicon. bump numpy from 1.26 to 1.26.4 to meet dependency requirements for decord2
Allows systems without CUDA to fallback to CPU.
The pin_memory() optimization is only available for CUDA backends.
CUDA handles this internally but we need to handle it directly for CPU
…or cpu

introduce workarounds for torch operations not available on mps like repeats of complex tensors
forcefully flush pending operations with synchronize and empty the cache.
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 30, 2025
Copy link

@mattiagaggi mattiagaggi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to overlap with your other pull request :)

Also I'd make some tests

@provos
Copy link
Author

provos commented Dec 3, 2025

This is a continuation of the other PR. I just didn't want to clutter things when I didn't know yet how hard MPS support was going to be. I can add the end-to-end test for the video predictor to the PR but didn't want to pollute your repo. Is there a place you would like tests to go?

Copy link

@drduhe drduhe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesmoe and does a better job that was I was getting at in my PR, #326 in addressing the Mac deployments. I am going to remove that logic from my PR now seeing that this is in the pipeline.

@provos
Copy link
Author

provos commented Dec 11, 2025

@drduhe do you know why the import is stuck? i noticed the same with my other PR which was approved a week(?) ago?

@drduhe
Copy link

drduhe commented Dec 11, 2025

@drduhe do you know why the import is stuck? i noticed the same with my other PR which was approved a week(?) ago?

@provos I am not a maintainer on the project so it probably needs to get approval from someone with more pull than I have =). But FWIW I did create a tests directory as part of my PR as a starting point for where we can put unit tests. @mattiagaggi might have more information on the process, SLA, and workflow requirements.

@nico1996it
Copy link

It would be great to have this merged...

@dario-k
Copy link

dario-k commented Jan 12, 2026

@provos this looks awesome! I currently maintain my own custom implementation for this, but I'd love to deprecate it in favor of this upstream solution. Is this waiting on anything specific, or is it just pending review?

@drduhe
Copy link

drduhe commented Jan 12, 2026

It seems that the folks managing this repository take a while to churn through the open PRs to provide feedback or approvals and the queue is actually increasing quite a bit weekly - but that could be due to the holidays and the start of the new year, along with the fact that most of that group are more academically focused it seems.

I have reached out to several of the folks who are owners own the repository to see if there would be a way we could get more people added to this group so that we get a bit faster turn around. I for one, would be super happy to contribute in this capacity, but I haven't gotten a response yet. I would like to see this package get some love and regular attention as it has a ton of interest from the community.

If we can get some buy in or increased velocity from the managing team it would be awesome, if not perhaps we could set up a more community focused fork of this repository that would proctor ourselves =)

@provos
Copy link
Author

provos commented Jan 12, 2026

@dario-k somebody at Meta/FAIR has to approve this. It does not seem that anyone from there has looked at this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants